Skip to content

USHIFT-6808: C2CC Chaos Tests (Restarting Services, toggling ethernet)#6930

Open
pmtk wants to merge 5 commits into
openshift:mainfrom
pmtk:c2cc/chaos-testing/restarts
Open

USHIFT-6808: C2CC Chaos Tests (Restarting Services, toggling ethernet)#6930
pmtk wants to merge 5 commits into
openshift:mainfrom
pmtk:c2cc/chaos-testing/restarts

Conversation

@pmtk

@pmtk pmtk commented Jun 24, 2026

Copy link
Copy Markdown
Member

Summary by CodeRabbit

  • New Features

    • Added new chaos/resilience test coverage for cross-cluster environments, including service restarts, network disruption, and VM network outages.
    • Expanded validation to cover cross-cluster connectivity, DNS, and recovery across multiple clusters.
    • Added support for running targeted test tags and more flexible multi-cluster setup.
  • Bug Fixes

    • Improved cluster health checks and recovery validation to better detect and confirm full restoration after disruptions.
    • Updated IP routing verification to cover expected rules more reliably across clusters.

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Jun 24, 2026
@openshift-ci-robot

openshift-ci-robot commented Jun 24, 2026

Copy link
Copy Markdown

@pmtk: This pull request references USHIFT-6808 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "5.0.0" version, but no target version was set.

Details

In response to this:

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@coderabbitai

coderabbitai Bot commented Jun 24, 2026

Copy link
Copy Markdown
Contributor

Walkthrough

Adds a C2CC chaos resilience test suite (chaos.robot) with seven disruption scenarios (MicroShift restart, OVN-K pod deletion, NM restart, NIC outage, and combinations). Extends c2cc.resource with cluster health, IP-rule, infrastructure, connectivity, NIC disruption, and lifecycle keywords. Updates c2cc_common.sh to support tag filtering and additional VM name variables. Moves shared keywords from probe.robot into c2cc.resource.

Changes

C2CC Chaos Suite and Shared Resource Extensions

Layer / File(s) Summary
c2cc.resource: CIDR maps, imports, and IP rule verification
test/resources/c2cc.resource
Imports libvirt.resource, adds ${POD_CIDR_MAP}, ${SVC_CIDR_MAP}, @{ALL_CLUSTERS}, and replaces fixed routing checks with Verify Service IP Rules and Verify All IP Rules.
c2cc.resource: dynamic RemoteCluster state and health-gate keywords
test/resources/c2cc.resource
Replaces fixed-count RemoteCluster assertion with ALL_CLUSTERS-derived count; adds Verify RemoteCluster State By Name, RemoteCluster CR Name From IP, Verify RemoteCluster Unhealthy On Observers, and Ensure All Clusters Healthy.
c2cc.resource: infrastructure, connectivity, DNS, and OVN-K keywords
test/resources/c2cc.resource
Adds per-remote CIDR/route/rule verification, cross-cluster connectivity and DNS helpers, and OVN-K pod readiness wait keywords.
c2cc.resource: NIC disruption and remote-cluster lifecycle keywords
test/resources/c2cc.resource
Adds Disable/Enable All NICs For VM via virsh and Deregister Remote Cluster/Reconnect To Cluster for SSH teardown and re-registration after disruption.
chaos.robot: suite settings, test cases, and keywords
test/suites/c2cc/chaos.robot
New Robot Framework suite with seven chaos test cases and Verify Full Recovery On Clusters, Restore NICs And Reconnect, Setup, and Teardown keywords.
Chaos scenario script and shell harness extensions
test/scenarios-bootc/c2cc/el98-src@c2cc-chaos.sh, test/bin/c2cc_common.sh
New scenario script wires chaos.robot with TEST_RANDOMIZATION=none; c2cc_common.sh adds include_tags parameter, include_args construction, and HOST2_VM_NAME/HOST3_VM_NAME forwarding.
probe.robot: keyword refactor
test/suites/c2cc/probe.robot
Removes Verify RemoteCluster State By Name and RemoteCluster CR Name From IP (moved to c2cc.resource) and adds Get RemoteCluster Errors By Name.

Sequence Diagram(s)

sequenceDiagram
  participant ChaosTest as Chaos Test Case
  participant c2cc.resource as c2cc.resource Keywords
  participant Cluster as Target Cluster VM
  participant Observers as Observer Clusters

  rect rgba(70, 130, 180, 0.5)
    note over ChaosTest,Observers: Pre-disruption gate
    ChaosTest->>c2cc.resource: Ensure All Clusters Healthy
    c2cc.resource->>Cluster: Wait Until Keyword Succeeds — Verify RemoteCluster State (Healthy)
  end

  rect rgba(180, 80, 80, 0.5)
    note over ChaosTest,Cluster: Fault injection
    ChaosTest->>Cluster: Inject disruption (restart service / delete pod / Disable All NICs For VM)
    ChaosTest->>Observers: Verify RemoteCluster Unhealthy On Observers
  end

  rect rgba(60, 160, 60, 0.5)
    note over ChaosTest,Observers: Recovery verification
    ChaosTest->>c2cc.resource: Wait Until Keyword Succeeds — Ensure All Clusters Healthy
    ChaosTest->>c2cc.resource: Verify Full Recovery On Clusters
    c2cc.resource->>Cluster: Verify C2CC Infrastructure On Cluster
    c2cc.resource->>Cluster: Verify Cross Cluster Connectivity
    c2cc.resource->>Cluster: Verify Cross Cluster DNS
  end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • openshift/microshift#6827: Introduced c2cc_common.sh helpers that this PR extends with tag filtering and additional VM name variables.
  • openshift/microshift#6894: Also modifies c2cc_run_tests in c2cc_common.sh to extend the run_tests argument list with dynamic --variable inputs.

Suggested reviewers

  • vanhalenar
🚥 Pre-merge checks | ✅ 12 | ❌ 3

❌ Failed checks (3 warnings)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Test Structure And Quality ⚠️ Warning FAIL: chaos.robot still uses Command On Cluster for NetworkManager restart and sets DISABLED_VM after NIC shutdown, so SSH-fault cleanup can be skipped. Use Disruptive Command On Cluster for NetworkManager restarts and set DISABLED_VM before disabling NICs (or in a finally block) so teardown always restores interfaces.
Ipv6 And Disconnected Network Test Compatibility ⚠️ Warning c2cc_common.sh hardcodes 10.x pod/service CIDRs for all clusters, and the new chaos scenario uses them with no IPv6 override. Add IPv6/dual-stack CIDR handling for the chaos scenario or mark it IPv4-only and verify in the IPv6 disconnected CI job.
✅ Passed checks (12 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly summarizes the main change: adding C2CC chaos tests for service restarts and NIC/ethernet disruption.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Stable And Deterministic Test Names ✅ Passed Changed test names are static Robot case titles; no dynamic identifiers or Ginkgo-style titles appear in the modified files.
Microshift Test Compatibility ✅ Passed PASS: The PR adds only Robot/bash C2CC chaos/probe helpers and scenarios; no new Ginkgo It/Describe tests or MicroShift-unsafe OpenShift APIs were introduced.
Single Node Openshift (Sno) Test Compatibility ✅ Passed PASS: This PR adds Robot Framework/Shell C2CC chaos tests, not new Ginkgo e2e tests; the SNO-specific check doesn’t apply.
Topology-Aware Scheduling Compatibility ✅ Passed Changed files are test scripts/resources only; no pod scheduling constraints, affinity, selectors, or topology logic were introduced.
Ote Binary Stdout Contract ✅ Passed PASS: The PR only touches shell/Robot test assets; no Go main/TestMain/BeforeSuite code or stdout writes were added.
No-Weak-Crypto ✅ Passed Reviewed changed files; found no MD5/SHA1/DES/RC4/3DES/Blowfish/ECB, custom crypto, or secret/token comparisons—only generic string checks.
Container-Privileges ✅ Passed Only test shell/Robot files changed; no K8s manifests or securityContext settings with privileged/hostPID/hostNetwork/hostIPC/CAP_SYS_ADMIN/allowPrivilegeEscalation were added.
No-Sensitive-Data-In-Logs ✅ Passed No new logging of secrets/PII found; the PR only adds operational test helpers and passes IPs/paths internally.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands.

@openshift-ci openshift-ci Bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jun 24, 2026
@openshift-ci

openshift-ci Bot commented Jun 24, 2026

Copy link
Copy Markdown
Contributor

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@coderabbitai coderabbitai Bot added the ready-for-human-review Indicates a PR has been reviewed by automated tools and is ready for human review label Jun 24, 2026
@openshift-ci

openshift-ci Bot commented Jun 24, 2026

Copy link
Copy Markdown
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: pmtk

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci Bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 24, 2026

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@test/bin/c2cc_common.sh`:
- Around line 186-193: The `setup_hosts` logic is handling `get_host_ip`
failures with early returns, but `full_vm_name` for `host2_vm` and `host3_vm` is
not checked the same way. Update the assignments around `full_vm_name`,
`host2_vm`, and `host3_vm` so a failure aborts immediately instead of
propagating empty `HOST2_VM_NAME`/`HOST3_VM_NAME` values into the NIC chaos
keywords. Keep the error handling consistent with the existing `get_host_ip`
pattern by returning from the function on lookup failure before the `readonly`
exports.

In `@test/resources/c2cc.resource`:
- Around line 562-563: The deregistration cleanup currently always calls
SSHLibrary.Close Connection even when SSHLibrary.Switch Connection fails, which
can close the wrong active SSH session on повторное deregistration. Update the
teardown logic in the affected resource so Close Connection only runs when the
switch to ${alias} succeeds, using the existing SSHLibrary.Switch Connection and
SSHLibrary.Close Connection keywords together with a conditional or status check
to preserve the current session when the alias is missing.
- Around line 541-549: In Disable All NICs For VM, add a fail-fast guard after
Get Vnet Devices For MicroShift Host returns ${vnet_ifaces} so the test errors
immediately when no vnet devices are found instead of silently doing nothing.
Check for an empty list before the FOR loop, and raise a clear failure that
includes ${vm_name} and the lack of discovered NICs so the issue is obvious in
the test output.
- Around line 483-488: The precondition gate in Ensure All Clusters Healthy only
verifies RemoteCluster CR status and can let a degraded local MicroShift/OVN-K
environment pass. Update this keyword to run the actual cluster healthcheck
before fault injection, alongside or instead of Verify RemoteCluster State, so
the setup blocks unless the real cluster health is healthy for each alias
(cluster-a, cluster-b, cluster-c).
- Around line 170-184: The IP rule checks in Verify Service IP Rules and Verify
All IP Rules only validate the rule targets, so they can miss wrong ordering.
Update these Robot Framework keywords to also assert the expected priority value
for each rule returned by Command On Cluster, using the existing `${stdout}`
checks with Should Contain against the full rule strings so table 200/201
entries are verified at the correct priority.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 9ab20d10-7e3c-468d-a472-1cea8e83ce11

📥 Commits

Reviewing files that changed from the base of the PR and between bfeabef and 64e01f1.

📒 Files selected for processing (6)
  • test/bin/c2cc_common.sh
  • test/resources/c2cc.resource
  • test/scenarios-bootc/c2cc/el98-src@c2cc-chaos.sh
  • test/suites/c2cc/chaos.robot
  • test/suites/c2cc/cleanup.robot
  • test/suites/c2cc/probe.robot
💤 Files with no reviewable changes (2)
  • test/suites/c2cc/cleanup.robot
  • test/suites/c2cc/probe.robot

Comment thread test/bin/c2cc_common.sh
Comment thread test/resources/c2cc.resource
Comment thread test/resources/c2cc.resource
Comment thread test/resources/c2cc.resource
Comment thread test/resources/c2cc.resource Outdated
pmtk added a commit to pmtk/microshift that referenced this pull request Jun 24, 2026
- c2cc_common.sh: add || return 1 to full_vm_name calls for consistency
- c2cc.resource: fail fast when no vnet interfaces found for VM
- c2cc.resource: only close SSH connection after successful switch

Co-Authored-By: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
pmtk added a commit to pmtk/microshift that referenced this pull request Jun 24, 2026
- c2cc_common.sh: add || return 1 to full_vm_name calls for consistency
- c2cc.resource: fail fast when no vnet interfaces found for VM
- c2cc.resource: only close SSH connection after successful switch
@pmtk pmtk force-pushed the c2cc/chaos-testing/restarts branch from 30371f0 to 0d99900 Compare June 24, 2026 13:41
@pmtk

pmtk commented Jun 24, 2026

Copy link
Copy Markdown
Member Author

/test ?

@pmtk

pmtk commented Jun 24, 2026

Copy link
Copy Markdown
Member Author

/test e2e-aws-tests-bootc-c2cc e2e-aws-tests-bootc-c2cc-arm

pmtk added 5 commits June 24, 2026 18:57
MicroShift, NetworkManager, OVN-K restarts.
NIC disabling.
- c2cc_common.sh: add || return 1 to full_vm_name calls for consistency
- c2cc.resource: fail fast when no vnet interfaces found for VM
- c2cc.resource: only close SSH connection after successful switch
@pmtk pmtk force-pushed the c2cc/chaos-testing/restarts branch from 0d99900 to bcfed52 Compare June 24, 2026 16:58
@pmtk

pmtk commented Jun 24, 2026

Copy link
Copy Markdown
Member Author

/test e2e-aws-tests-bootc-c2cc e2e-aws-tests-bootc-c2cc-arm

@pmtk pmtk marked this pull request as ready for review June 24, 2026 17:01
@openshift-ci openshift-ci Bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jun 24, 2026
@openshift-ci openshift-ci Bot requested review from jogeo and vanhalenar June 24, 2026 17:03

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@test/suites/c2cc/chaos.robot`:
- Line 55: The cluster command in the chaos test is too disruptive for Command
On Cluster because restarting NetworkManager can drop the SSH session before the
call returns. Update the test in chaos.robot to use Disruptive Command On
Cluster for the systemctl restart NetworkManager step, using the existing
cluster-c command block so the session loss is handled safely.
- Around line 64-76: The disabled-state flag is set too late in the chaos test
flow, so a failure inside Disable All NICs For VM can leave NICs down while
Restore NICs And Reconnect sees no disabled VM to clean up. In the c2cc chaos
scenario, set ${DISABLED_VM} before calling Disable All NICs For VM and keep the
existing reset after recovery; use the existing keywords Disable All NICs For
VM, Restore NICs And Reconnect, and ${DISABLED_VM} to locate the change.
- Around line 149-157: The setup flow is missing validation for VM name
environment variables, which lets NIC-outage tests fail later with unclear
errors. Update Check Required Env Variables to also require HOST2_VM_NAME and
HOST3_VM_NAME alongside the existing host/IP/port/kubeconfig checks, and ensure
the Setup sequence still calls that validation before Register Remote Cluster so
failures surface early with a clear message.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 8b370a02-f8f1-4408-b570-8eb1581bfc0e

📥 Commits

Reviewing files that changed from the base of the PR and between 0d99900 and bcfed52.

📒 Files selected for processing (5)
  • test/bin/c2cc_common.sh
  • test/resources/c2cc.resource
  • test/scenarios-bootc/c2cc/el98-src@c2cc-chaos.sh
  • test/suites/c2cc/chaos.robot
  • test/suites/c2cc/probe.robot
💤 Files with no reviewable changes (1)
  • test/suites/c2cc/probe.robot
🚧 Files skipped from review as they are similar to previous changes (2)
  • test/bin/c2cc_common.sh
  • test/resources/c2cc.resource

[Documentation] Restart NetworkManager on cluster-c.
... Verify kernel routes/rules are restored and connectivity recovers.
[Setup] Ensure All Clusters Healthy
Command On Cluster cluster-c systemctl restart NetworkManager

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🩺 Stability & Availability | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
rg -nP -C5 '^(Command On Cluster|Disruptive Command On Cluster)\b' test/resources/c2cc.resource

Repository: openshift/microshift

Length of output: 1266


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== test/suites/c2cc/chaos.robot =="
wc -l test/suites/c2cc/chaos.robot
sed -n '1,180p' test/suites/c2cc/chaos.robot

echo
echo "== test/resources/c2cc.resource =="
wc -l test/resources/c2cc.resource
sed -n '1,180p' test/resources/c2cc.resource

Repository: openshift/microshift

Length of output: 16640


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== test/suites/c2cc/chaos.robot =="
wc -l test/suites/c2cc/chaos.robot
sed -n '1,180p' test/suites/c2cc/chaos.robot

echo
echo "== test/resources/c2cc.resource =="
wc -l test/resources/c2cc.resource
sed -n '1,180p' test/resources/c2cc.resource

Repository: openshift/microshift

Length of output: 16640


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== test/suites/c2cc/chaos.robot =="
wc -l test/suites/c2cc/chaos.robot
sed -n '1,180p' test/suites/c2cc/chaos.robot

echo
echo "== test/resources/c2cc.resource =="
wc -l test/resources/c2cc.resource
sed -n '1,180p' test/resources/c2cc.resource

Repository: openshift/microshift

Length of output: 16640


Switch this to Disruptive Command On Cluster systemctl restart NetworkManager can tear down the SSH session before Command On Cluster returns, so this test can fail intermittently.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/suites/c2cc/chaos.robot` at line 55, The cluster command in the chaos
test is too disruptive for Command On Cluster because restarting NetworkManager
can drop the SSH session before the call returns. Update the test in chaos.robot
to use Disruptive Command On Cluster for the systemctl restart NetworkManager
step, using the existing cluster-c command block so the session loss is handled
safely.

Comment on lines +64 to +76
${vnet_ifaces} Disable All NICs For VM ${HOST2_VM_NAME}
VAR ${DISABLED_VM} ${HOST2_VM_NAME} scope=TEST
VAR @{DISABLED_IFACES} @{vnet_ifaces} scope=TEST
Verify RemoteCluster Unhealthy On Observers ${HOST2_IP} cluster-a cluster-c
Enable All NICs For VM ${HOST2_VM_NAME} ${vnet_ifaces}
Reconnect To Cluster cluster-b ${HOST2_IP} ${HOST2_SSH_PORT} ${KUBECONFIG_B}
... timeout=${RECOVERY_TIMEOUT}
VAR ${DISABLED_VM} ${EMPTY} scope=TEST
Wait Until Keyword Succeeds ${RECOVERY_TIMEOUT} ${RECOVERY_RETRY}
... Verify Cluster Is Healthy cluster-b
Verify Full Recovery On Clusters cluster-b
[Teardown] Restore NICs And Reconnect
... ${HOST2_VM_NAME} cluster-b ${HOST2_IP} ${HOST2_SSH_PORT} ${KUBECONFIG_B}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🩺 Stability & Availability | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
rg -nP -C8 '^Disable All NICs For VM\b' test/resources/c2cc.resource

Repository: openshift/microshift

Length of output: 1022


🏁 Script executed:

#!/bin/bash
set -euo pipefail

sed -n '541,575p' test/resources/c2cc.resource
printf '\n---\n'
rg -n "DISABLED_VM|Restore NICs And Reconnect|Disable NICs|Reconnect To Cluster" test/suites/c2cc/chaos.robot test/resources/c2cc.resource

Repository: openshift/microshift

Length of output: 2965


🏁 Script executed:

#!/bin/bash
set -euo pipefail

sed -n '1,140p' test/suites/c2cc/chaos.robot

Repository: openshift/microshift

Length of output: 7246


🏁 Script executed:

#!/bin/bash
set -euo pipefail

sed -n '1,220p' test/suites/c2cc/chaos.robot
printf '\n--- resource snippet ---\n'
sed -n '541,620p' test/resources/c2cc.resource

Repository: openshift/microshift

Length of output: 10455


🏁 Script executed:

#!/bin/bash
set -euo pipefail

rg -n -C6 "Restore NICs And Reconnect|DISABLED_VM" test/resources/c2cc.resource test/suites/c2cc/chaos.robot

Repository: openshift/microshift

Length of output: 6798


🏁 Script executed:

#!/bin/bash
set -euo pipefail

grep -nR "DISABLED_VM" -n test/resources/c2cc.resource test/suites/c2cc/chaos.robot

Repository: openshift/microshift

Length of output: 646


🏁 Script executed:

#!/bin/bash
set -euo pipefail

wc -l test/resources/c2cc.resource
sed -n '620,720p' test/resources/c2cc.resource

Repository: openshift/microshift

Length of output: 191


🏁 Script executed:

#!/bin/bash
set -euo pipefail

rg -n -C8 "Restore NICs And Reconnect|Disable All NICs For VM|DISABLED_VM|DISABLED_IFACES" test

Repository: openshift/microshift

Length of output: 9865


🏁 Script executed:

#!/bin/bash
set -euo pipefail

python3 - <<'PY'
from pathlib import Path
p = Path("test/resources/c2cc.resource")
for i, line in enumerate(p.read_text().splitlines(), 1):
    if "Restore NICs And Reconnect" in line or "DISABLED_VM" in line or "DISABLED_IFACES" in line:
        start = max(1, i-4)
        end = min(i+12, sum(1 for _ in p.open()))
        print(f"--- around line {i} ---")
        for j, l in enumerate(p.read_text().splitlines()[start-1:end], start):
            print(f"{j}: {l}")
PY

Repository: openshift/microshift

Length of output: 158


Move the disabled-state flag before the NIC-disable call. In test/suites/c2cc/chaos.robot, Disable All NICs For VM can fail after some NICs are already down, but ${DISABLED_VM} stays empty until the call returns, so Restore NICs And Reconnect skips cleanup on that path.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/suites/c2cc/chaos.robot` around lines 64 - 76, The disabled-state flag
is set too late in the chaos test flow, so a failure inside Disable All NICs For
VM can leave NICs down while Restore NICs And Reconnect sees no disabled VM to
clean up. In the c2cc chaos scenario, set ${DISABLED_VM} before calling Disable
All NICs For VM and keep the existing reset after recovery; use the existing
keywords Disable All NICs For VM, Restore NICs And Reconnect, and ${DISABLED_VM}
to locate the change.

Comment on lines +149 to +157
Setup
[Documentation] Set up clusters and deploy test workloads.
Check Required Env Variables
Login MicroShift Host
Setup Kubeconfig
Register Local Cluster cluster-a
Register Remote Cluster cluster-b ${HOST2_IP} ${HOST2_SSH_PORT} ${KUBECONFIG_B}
Register Remote Cluster cluster-c ${HOST3_IP} ${HOST3_SSH_PORT} ${KUBECONFIG_C}
Deploy Test Workloads

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎯 Functional Correctness | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
rg -nP -C10 '^Check Required Env Variables\b' test/resources/c2cc.resource
rg -nP 'HOST2_VM_NAME|HOST3_VM_NAME' test/bin/c2cc_common.sh

Repository: openshift/microshift

Length of output: 264


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo '--- test/resources/c2cc.resource: Check Required Env Variables ---'
rg -n -C 20 '^Check Required Env Variables\b|HOST2_VM_NAME|HOST3_VM_NAME|VM_NAME' test/resources/c2cc.resource

echo
echo '--- test/bin/c2cc_common.sh around HOST2/3 VM vars ---'
rg -n -C 20 'HOST2_VM_NAME|HOST3_VM_NAME|host2_vm|host3_vm|KUBECONFIG_B|KUBECONFIG_C' test/bin/c2cc_common.sh

Repository: openshift/microshift

Length of output: 225


🏁 Script executed:

#!/bin/bash
set -euo pipefail

sed -n '1,260p' test/resources/c2cc.resource | nl -ba | sed -n '1,260p'

Repository: openshift/microshift

Length of output: 198


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo '--- files of interest ---'
git ls-files 'test/resources/*' 'test/bin/*' | sed -n '1,200p'

echo
echo '--- search for Check Required Env Variables across repo ---'
rg -n -C 4 'Check Required Env Variables' test

echo
echo '--- search for HOST2_VM_NAME / HOST3_VM_NAME across repo ---'
rg -n -C 4 'HOST2_VM_NAME|HOST3_VM_NAME' test

Repository: openshift/microshift

Length of output: 44477


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Read the relevant resource file with line numbers, without relying on external tools like nl.
python3 - <<'PY'
from pathlib import Path
p = Path('test/resources/c2cc.resource')
text = p.read_text().splitlines()
for i, line in enumerate(text, 1):
    if 'Check Required Env Variables' in line or 'HOST2_VM_NAME' in line or 'HOST3_VM_NAME' in line:
        start = max(1, i-20)
        end = min(len(text), i+40)
        for j in range(start, end+1):
            print(f"{j:4d}: {text[j-1]}")
        break
PY

Repository: openshift/microshift

Length of output: 158


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo '--- matches for the keyword ---'
rg -n -C 6 '^Check Required Env Variables\b|Check Required Env Variables' test

echo
echo '--- c2cc resource files ---'
git ls-files 'test/resources/*c2cc*' 'test/resources/*.resource' 'test/bin/*c2cc*' | sed -n '1,200p'

echo
echo '--- c2cc.resource excerpts around VM-name vars if present ---'
python3 - <<'PY'
from pathlib import Path
p = Path('test/resources/c2cc.resource')
if p.exists():
    text = p.read_text().splitlines()
    for i, line in enumerate(text, 1):
        if 'HOST2_VM_NAME' in line or 'HOST3_VM_NAME' in line or 'Check Required Env Variables' in line:
            start = max(1, i-25)
            end = min(len(text), i+35)
            for j in range(start, end+1):
                print(f"{j:4d}: {text[j-1]}")
            print('---')
PY

Repository: openshift/microshift

Length of output: 50376


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Find where the setup keyword actually lives and how the VM-name vars are defined.
rg -n -C 8 'Check Required Env Variables|HOST2_VM_NAME|HOST3_VM_NAME' test/suites test/resources test/bin

Repository: openshift/microshift

Length of output: 50376


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo '--- all occurrences of HOST2_VM_NAME / HOST3_VM_NAME ---'
rg -n -C 3 'HOST2_VM_NAME|HOST3_VM_NAME' .

echo
echo '--- all occurrences of Check Required Env Variables ---'
rg -n -C 12 '^Check Required Env Variables\b' .

echo
echo '--- likely c2cc setup file around variable definitions ---'
python3 - <<'PY'
from pathlib import Path
for path in [Path('test/bin/c2cc_common.sh'), Path('test/resources/c2cc.resource')]:
    if path.exists():
        print(f'\n### {path}')
        text = path.read_text().splitlines()
        for i, line in enumerate(text, 1):
            if 'HOST2_VM_NAME' in line or 'HOST3_VM_NAME' in line or 'Check Required Env Variables' in line:
                start = max(1, i-20)
                end = min(len(text), i+30)
                for j in range(start, end+1):
                    print(f"{j:4d}: {text[j-1]}")
                print('---')
PY

Repository: openshift/microshift

Length of output: 12016


🏁 Script executed:

#!/bin/bash
set -euo pipefail

python3 - <<'PY'
from pathlib import Path
candidates = [p for p in Path('test').rglob('*') if p.is_file() and ('c2cc' in p.name or 'c2cc' in str(p))]
for p in sorted(candidates):
    print(p)
PY

Repository: openshift/microshift

Length of output: 1195


Add VM-name checks to Check Required Env Variables. HOST2_VM_NAME and HOST3_VM_NAME default to ${EMPTY} but aren’t validated there, so the NIC-outage cases fail later with a vague error.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/suites/c2cc/chaos.robot` around lines 149 - 157, The setup flow is
missing validation for VM name environment variables, which lets NIC-outage
tests fail later with unclear errors. Update Check Required Env Variables to
also require HOST2_VM_NAME and HOST3_VM_NAME alongside the existing
host/IP/port/kubeconfig checks, and ensure the Setup sequence still calls that
validation before Register Remote Cluster so failures surface early with a clear
message.

@openshift-ci

openshift-ci Bot commented Jun 24, 2026

Copy link
Copy Markdown
Contributor

@pmtk: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. ready-for-human-review Indicates a PR has been reviewed by automated tools and is ready for human review

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants